VisuNet can be used with any rule-based classifier whose rules are expressed in one of the supported data frame formats.
Data should be expressed as a data frame that contains the following columns:
- features - the left-hand side of the rule: comma-separated attributes and their values, type factor
- decision - the right-hand side of the rule: the decision value, type factor
- accuracyRHS - the rule accuracy, type numeric
- supportRHS - the rule support, type numeric

A p-value (pValue) column is optional.
| | features | decision | accuracyRHS | supportRHS | pValue |
|---|---|---|---|---|---|
| 4 | NPR2=2,CAPS2=3 | control | 1.00000 | 17 | 0.0000397 |
| 5 | MAP7=2,COX2=3 | autism | 1.00000 | 23 | 0.0000677 |
| 6 | NCKAP5L=1,PPOX=1 | control | 0.95119 | 20 | 0.0000689 |
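As a minimal sketch, a rule set in this format can be assembled by hand in base R. The values are copied from the sample rows above. The helper `has_rule_columns` is a hypothetical name introduced here for illustration and is not part of VisuNet.

```r
# Build a small rule set by hand (values taken from the sample table above).
my_rules <- data.frame(
  features    = factor(c("NPR2=2,CAPS2=3", "MAP7=2,COX2=3", "NCKAP5L=1,PPOX=1")),
  decision    = factor(c("control", "autism", "control")),
  accuracyRHS = c(1.00000, 1.00000, 0.95119),
  supportRHS  = c(17, 23, 20),
  pValue      = c(0.0000397, 0.0000677, 0.0000689)  # optional column
)

# Hypothetical helper (not a VisuNet function): TRUE when the four
# required columns are all present in the data frame.
has_rule_columns <- function(df) {
  required <- c("features", "decision", "accuracyRHS", "supportRHS")
  all(required %in% names(df))
}

has_rule_columns(my_rules)
```

A check of this kind only confirms that the column names are present; it does not validate the content of the rules.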
You can use the ‘line by line’ format with the option type = 'L':
#load the sample rule set; data() creates the object autcon_ruleset in the workspace
data(autcon_ruleset)
rules <- autcon_ruleset
vis_out <- visunet(rules, type = 'L')

The rules data frame that is the output of R.ROSETTA can be passed directly to VisuNet. See ?rosetta from the R.ROSETTA package for details.
The R.ROSETTA output format can be used with the option type = 'RDF', which is the default value.
#the rule-based model construction using R.ROSETTA
resultsRos <- rosetta(autcon)
vis_out <- visunet(resultsRos$main, type = 'RDF')

VisuNet is an R package implemented as a Shiny Gadget, which makes it possible to run VisuNet as an R function that takes the rule set as an argument.
require(VisuNet)
#Sample rule set for a classifier of autistic and non-autistic young males
#'Line by line' data type
data(autcon_ruleset)
#Run VisuNet
#Remember to click DONE once you have finished your work with VisuNet
vis_out <- visunet(autcon_ruleset, type = 'L')

The available visunet parameters are:
- ruleSet - the set of rules in one of the available structures; see Input formats
- type - character string specifying the type of the input data. Two types are implemented:
  - "RDF" - the R.ROSETTA output (see R.ROSETTA format)
  - "L" - the ‘line by line’ file format (see ‘line by line’ format)

  The default is "RDF".
- NodeColorType - character string specifying the color of nodes:
  - "DL" - feature discretization levels. This option is available for data discretized into three levels: 1, 2 and 3. For gene expression data, the discretization levels correspond to: 1 - under-expressed gene, 2 - no change in gene expression and 3 - over-expressed gene.
  - "A" - the color of nodes is defined by the mean accuracy value for the node.

The node color scale according to the mean accuracy value
The CustObjectNodes and CustObjectEdges parameters are optional and can be used when rule network customization is needed.
- CustObjectNodes - a list that contains the customized VisuNet output for nodes. The list needs to contain two variables:
  - nodes - the customized VisuNet output for nodes
  - CustCol - the names of variables added/changed in the VisuNet output for nodes

  See Nodes customization for details.
- CustObjectEdges - a list that contains the customized VisuNet output for edges. The list needs to contain two variables:
  - edges - the customized VisuNet output for edges
  - CustCol - the names of variables added/changed in the VisuNet output for edges

  See Edges customization for details.
VisuNet displays the rule network constructed for the 10% of rules with the highest connection value. When only one decision variable is visible in the top 10% of rules, the threshold is extended to obtain rules for all decisions. The initial values of accuracy and support are defined for this set of rules.
The rule network parameters panel:
- min Accuracy - the minimum accuracy value for the set of rules that create the rule network
- min Support - the minimum support value for the set of rules that create the rule network
- Show top n nodes - shows the given number of the most significant nodes, ranked by connection value in the current rule network; set to 0 to switch the parameter off
- Color of nodes - the node color scheme. See NodeColorType in the Run VisuNet section for details.

The VisuNet output is a set of lists, each corresponding to one decision variable, plus one extra list for the combined decision ‘all’. The lists contain the information required to reproduce the rule networks, i.e. data frames for nodes and edges, and RulesSetPerNode, a list that shows the rules for each node. The data frames for nodes and edges include the variables required by the visNetwork package and additional variables that describe the quality of the node/edge obtained from the rules.
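The min Accuracy and min Support settings of the parameters panel can be pictured as a row filter on the rule set. The base R sketch below illustrates that idea on the sample rows shown earlier. It is not VisuNet's internal code; the threshold values and the inclusive comparison (`>=`) are assumptions made for illustration.

```r
# Sample rules in the 'line by line' format (values from the sample table).
rules <- data.frame(
  features    = factor(c("NPR2=2,CAPS2=3", "MAP7=2,COX2=3", "NCKAP5L=1,PPOX=1")),
  decision    = factor(c("control", "autism", "control")),
  accuracyRHS = c(1.00000, 1.00000, 0.95119),
  supportRHS  = c(17, 23, 20)
)

# Illustrative thresholds (assumed values, not VisuNet defaults).
min_acc  <- 0.96
min_supp <- 17

# Keep only the rules that meet both thresholds (inclusive, by assumption).
kept <- rules[rules$accuracyRHS >= min_acc & rules$supportRHS >= min_supp, ]
kept
```

Raising either threshold shrinks the set of rules, and therefore the network built from them.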
Structure of the data frame for nodes:
- id - unique node id, built from the attribute and its value on the left-hand side of the rule set
- label - the attribute name without the ‘=value’ part from the left-hand side of the rule set
- DiscState - the attribute value
- color.background - the node color, see node color types for details
- value - the node size
- color.border - the color of the node border
- meanAcc - the mean accuracy value from all rules that contain the node
- meanSupp - the mean support value from all rules that contain the node
- NRules - the number of rules that contain the node
- PrecRules - the fraction of rules that contain the node
- NodeConnection - the total connection value obtained from the rules that contain the node
- title - information visible in the tooltip
- group - the decision variable that occurs most frequently (>50%) in the rules associated with the node; otherwise group contains all comma-separated decision variables corresponding to the rules associated with the node. group defines the content of the ‘Select by decision’ drop-down box.

Structure of the data frame for edges:
- from, to - the pair of nodes that create the edge
- conn - the connection variable obtained from the rules associated with the edge
- connNorm - the connection variable normalized according to the maximum connection variable in the rule network
- label2 - the edge id
- color - the edge color
- title - information visible in the tooltip
- width - the edge width, defined according to the normalized connection value

Rule networks are constructed using the visNetwork package, which makes it possible to add or change node and edge properties, e.g. to change the color of nodes or the edge type according to specific conditions. We are not limited to the variables that already exist in a rule network object; we can also add variables that are implemented in visNetwork. See ?visNodes and ?visEdges for a full list of available options.
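To make the node-level columns described above more concrete, the sketch below derives quantities analogous to label, DiscState, meanAcc, meanSupp, NRules and PrecRules from a rule set in base R. It is an illustration only, not VisuNet's implementation. The rules are entirely made up (A, B and C are placeholder attributes), and taking the total number of rules as the denominator of PrecRules is an assumption.

```r
# Made-up toy rules; A, B and C are placeholder attributes, not real data.
rules <- data.frame(
  features    = c("A=1,B=2", "A=1,C=3", "B=2,C=1"),
  decision    = c("yes", "yes", "no"),
  accuracyRHS = c(0.9, 0.7, 0.8),
  supportRHS  = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Split each left-hand side into attribute=value pairs (one node per pair).
node_list <- strsplit(rules$features, ",", fixed = TRUE)

# Long table with one row per (rule, node) combination.
long <- data.frame(
  id          = unlist(node_list),
  accuracyRHS = rep(rules$accuracyRHS, lengths(node_list)),
  supportRHS  = rep(rules$supportRHS, lengths(node_list)),
  stringsAsFactors = FALSE
)

# Per-node summaries over all rules that contain the node.
mean_acc  <- tapply(long$accuracyRHS, long$id, mean)
mean_supp <- tapply(long$supportRHS, long$id, mean)
n_rules   <- tapply(long$accuracyRHS, long$id, length)

node_stats <- data.frame(
  id        = names(mean_acc),
  label     = sub("=.*$", "", names(mean_acc)),  # attribute without '=value'
  DiscState = sub("^.*=", "", names(mean_acc)),  # attribute value
  meanAcc   = as.vector(mean_acc),
  meanSupp  = as.vector(mean_supp),
  NRules    = as.vector(n_rules),
  PrecRules = as.vector(n_rules) / nrow(rules),  # assumed denominator
  stringsAsFactors = FALSE
)
node_stats
```

The connection-based columns (NodeConnection, conn, connNorm) are left out because the text above does not define how the connection value is computed.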
We identified 11 genes previously reported in databases of autism associations: SFARI, AutDB and ASD. We would like to mark those genes as stars. The example shows how to do this:
#genes reported in databases of autism associations
aut_genes <- c('TSPOAP1', 'COX2','NCS1','RHPN1','FLRT2',
'BAHD1','NCKAP5L','PPOX', 'NGR2',
'ATXN8OS','DEPDC1')
#create the new variable that contains nodes information for 'all' decisions
nodes_RNO <- vis_out$all$nodes
#create the new vector of variables: shape. 'dot' is the default shape of nodes
nodes_RNO$shape <- rep('dot', length(nodes_RNO$label))
#mark selected genes as stars using the label attribute
nodes_RNO$shape[which(as.character(nodes_RNO$label) %in% aut_genes)] <- 'star'
#create the node object list
nodesL <- list(nodes = nodes_RNO,CustCol = c('shape'))
#rerun VisuNet with a new shape of nodes
vis_out2 <- visunet(autcon_ruleset, type = 'L', CustObjectNodes = nodesL)

To rerun VisuNet with the customized object for nodes, you need to provide the original rule set and a list, CustObjectNodes, that contains the customized VisuNet object for nodes. CustObjectNodes includes the customized object for nodes (nodes) and a vector of the column names that were changed in or added to the object (CustCol).
Sample customized rule network for the autistic and non-autistic young males classifier from VisuNet. Marked genes reported in databases of autism associations (constructed for min support=17 and min accuracy=88%)
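Because `%in%` silently ignores names that do not occur among the node labels, it can be worth checking which genes were actually matched before rerunning VisuNet. The sketch below shows one way to do this in base R. The data frame `nodes_demo` is a hypothetical stand-in for `vis_out$all$nodes` that mimics only its label column, and `genes_demo` is a shortened gene list used for illustration.

```r
# Hypothetical stand-in for vis_out$all$nodes; only 'label' is mimicked.
nodes_demo <- data.frame(
  label = c("COX2", "MAP7", "PPOX", "CAPS2"),
  stringsAsFactors = FALSE
)

# Shortened gene list used only for this illustration.
genes_demo <- c("COX2", "PPOX", "DEPDC1")

# Genes from the list that have no matching node in the network.
missing_genes <- setdiff(genes_demo, as.character(nodes_demo$label))

# 'star' for matched genes, 'dot' (the default shape) for all other nodes.
nodes_demo$shape <- ifelse(
  as.character(nodes_demo$label) %in% genes_demo, "star", "dot"
)

missing_genes
nodes_demo
```

An unmatched gene is not necessarily an error: it may simply be absent from the rules that form the current network.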
Let’s assume that COX2 controls MAP7 and we would like to show the edge direction on the rule network:
#mark the interaction between COX2 and MAP7 genes
edges_RNO <- vis_out$all$edges
#create the new vector of variables: arrows. 'enabled' is the default variable for edges
edges_RNO$arrows <- rep('enabled', length(edges_RNO$label2))
#add direction to selected edge using the label2 attribute
edges_RNO$arrows[which(edges_RNO$label2 == 'COX2=3-MAP7=2')] <- 'to'
#create the edge object list
edgesL <- list(edges = edges_RNO,CustCol = c('arrows'))
#rerun VisuNet with a new variable of edges
vis_out3 <- visunet(autcon_ruleset, type = 'L', CustObjectNodes = nodesL, CustObjectEdges = edgesL)

To rerun VisuNet with the customized object for edges, provide the original rule set and a list, CustObjectEdges, that contains the customized VisuNet object for edges. CustObjectEdges includes the customized object for edges (edges) and a vector of the column names that were changed in or added to the object (CustCol).
VisuNet can also be rerun with both customized objects, CustObjectEdges and CustObjectNodes, as in the call above.
Sample customized rule network for the autistic and non-autistic young males classifier from VisuNet. Marked genes reported in databases of autism associations and the edge direction between COX2 and MAP7 (constructed for min support=17 and min accuracy=88%)
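Matching an edge by its label2 string depends on the order in which the two nodes appear in the edge id. An alternative sketch is to match on the from and to columns and choose the arrow direction accordingly. The data frame `edges_demo` below is a hypothetical stand-in for `vis_out$all$edges`, and it is an assumption that node ids have the form attribute=value. The values 'to' and 'from' are arrow settings accepted by visNetwork; 'enabled' is the same placeholder default as in the example above.

```r
# Hypothetical stand-in for vis_out$all$edges; only 'from' and 'to' are mimicked.
edges_demo <- data.frame(
  from = c("MAP7=2", "NPR2=2"),
  to   = c("COX2=3", "CAPS2=3"),
  stringsAsFactors = FALSE
)

source_node <- "COX2=3"  # the controlling gene
target_node <- "MAP7=2"  # the controlled gene

# Same placeholder default as in the example above.
edges_demo$arrows <- rep("enabled", nrow(edges_demo))

# Edge stored as source -> target: the arrow head goes at the 'to' end.
forward <- edges_demo$from == source_node & edges_demo$to == target_node
edges_demo$arrows[forward] <- "to"

# Edge stored as target -> source: the arrow head goes at the 'from' end,
# so that it still points at the controlled gene.
backward <- edges_demo$from == target_node & edges_demo$to == source_node
edges_demo$arrows[backward] <- "from"

edges_demo
```

The resulting data frame can be wrapped in a list with CustCol = c('arrows') in the same way as edgesL above.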